Improving Policies without Measuring Merits
Authors
Abstract
Performing policy iteration in dynamic programming should require knowledge only of relative, rather than absolute, measures of the utility of actions: what Baird (1993) calls the advantages of actions at states. Nevertheless, existing methods in dynamic programming (including Baird's) compute some form of absolute utility function. For smooth problems, advantages satisfy two differential consistency conditions (including the requirement that they be free of curl), and we show that enforcing these can lead to appropriate policy improvement solely in terms of advantages.
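As a rough illustration of why relative measures suffice for policy improvement, the sketch below uses a small tabular example. It is not the method of the paper, which concerns smooth (continuous) problems and differential consistency conditions. The Q-table, the function names `advantages` and `greedy_policy`, and the definition of an advantage as the Q-value minus the best Q-value at that state are all assumptions made for this example.

```python
import numpy as np


def advantages(q):
    """Relative utilities: A(s, a) = Q(s, a) - max_a' Q(s, a').

    Assumed definition for this sketch; the best action at each
    state gets advantage 0 and every other action is negative.
    """
    return q - q.max(axis=1, keepdims=True)


def greedy_policy(values):
    """Pick the highest-valued action at each state (one row per state)."""
    return values.argmax(axis=1)


rng = np.random.default_rng(0)
q = rng.normal(size=(4, 3))      # hypothetical Q-values: 4 states, 3 actions
shift = rng.normal(size=(4, 1))  # arbitrary per-state offset (absolute level)

# The greedy choice depends only on differences between actions at a
# state, so it is unchanged by removing or shifting the absolute level.
assert np.array_equal(greedy_policy(q), greedy_policy(advantages(q)))
assert np.array_equal(greedy_policy(q), greedy_policy(q + shift))
```

In this toy setting the improved policy can be read off the advantages alone, which is the sense in which absolute utilities are unnecessary for the improvement step.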
Similar resources
The Point of Care: Measures of patients' experience in hospital - The King's Fund, July 2009
The King's Fund Point of Care programme aims to transform the quality of patients' experience in acute hospitals. This is against the background of the Department of Health's current range of policies designed to improve patients' experience of health care in England. Such an ambitious transformation requires the involvement of all frontline staff and will need first class leadership. Key...
[Rationing through waiting lists: measuring improvement and possible implications].
This paper analyzes the main policy initiatives for improving waiting lists in health care. The authors begin by describing strategies to reduce either waiting time or length of the list. They distinguish between demand-side and supply-side strategies. They proceed to discuss policies for improving the "quality" of waiting time. For each policy, they present both the expected effect and the ind...
Student Evaluations of Teaching: Perceived Merits and Disadvantages, and Suggestions for Improving the Assessment Method
The use of student evaluations of teaching (SET) to assess instructors' effectiveness is one of the most common and controversial practices in higher education. While a number of researchers have concluded that SETs are a valid, reliable, and worthwhile means of assessment (Wachtel, 2005; Koon & Murray, 1995; Centra, 1993), detractors contend that the method is too narrow in focus and open to ...
Alternative Courts and Drug Treatment: Finding a Rehabilitative Solution for Addicts in a Retributive System
Sentencing drug crimes and treating drug-addicted defendants often stem from contradictory theories of punishment. In the late twentieth century, courts traded rehabilitation for retributive ideals to fight the "War on Drugs." However, beginning with the Miami-Dade Drug Court, treatment and rehabilitation have returned to the forefront of sentencing policy in traditional and alternative drug ...
Data Management for Distributed Scientific Collaborations Using a Rule Engine
A virtual organization (VO) consists of groups of many scientists and organizations from geographically distributed regions that pool their computing and storage resources together in order to achieve some common scientific goal via grid computing. Such collaborations often result in the generation of a vast amount of shared data from experimental apparatus or statistical simulation. In order t...